{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "28d433f8",
   "metadata": {},
   "source": [
    "<a href=\"https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/llama_datasets/downloading_llama_datasets.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "493bd368-58ab-4558-9b26-fad7fda7170b",
   "metadata": {},
   "source": [
    "# Downloading a LlamaDataset from LlamaHub\n",
    "\n",
    "You can browse our available benchmark datasets via [llamahub.ai](https://llamahub.ai/). This notebook guide depicts how you can download the dataset and its source text documents. In particular, the `download_llama_dataset` will download the evaluation dataset (i.e., `LabelledRagDataset`) as well as the `Document`'s of the source text files used to build the evaluation dataset in the first place.\n",
    "\n",
    "Finally, in this notebook, we also demonstrate the end to end workflow of downloading an evaluation dataset, making predictions on it using your own RAG pipeline (query engine) and then evaluating these predictions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "985ce533-1dc4-4c17-a39f-ae17aff40f48",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "github url: https://raw.githubusercontent.com/nerdai/llama-hub/datasets/llama_hub/llama_datasets/library.json\n",
      "github url: https://media.githubusercontent.com/media/run-llama/llama_datasets/main/llama_datasets/paul_graham_essay/rag_dataset.json\n",
      "github url: https://media.githubusercontent.com/media/run-llama/llama_datasets/main/llama_datasets/paul_graham_essay/source.txt\n"
     ]
    }
   ],
   "source": [
    "from llama_index.llama_dataset import download_llama_dataset\n",
    "\n",
    "# download and install dependencies\n",
    "rag_dataset, documents = download_llama_dataset(\n",
    "    \"PaulGrahamEssayDataset\", \"./paul_graham\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ba51e959-d71b-467f-97c1-ce42a18c41a7",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>query</th>\n",
       "      <th>reference_contexts</th>\n",
       "      <th>reference_answer</th>\n",
       "      <th>reference_answer_by</th>\n",
       "      <th>query_by</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>In the essay, the author mentions his early ex...</td>\n",
       "      <td>[What I Worked On\\n\\nFebruary 2021\\n\\nBefore c...</td>\n",
       "      <td>The first computer the author used for program...</td>\n",
       "      <td>ai (gpt-4)</td>\n",
       "      <td>ai (gpt-4)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>The author switched his major from philosophy ...</td>\n",
       "      <td>[What I Worked On\\n\\nFebruary 2021\\n\\nBefore c...</td>\n",
       "      <td>The two specific influences that led the autho...</td>\n",
       "      <td>ai (gpt-4)</td>\n",
       "      <td>ai (gpt-4)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>In the essay, the author discusses his initial...</td>\n",
       "      <td>[I couldn't have put this into words when I wa...</td>\n",
       "      <td>The two main influences that initially drew th...</td>\n",
       "      <td>ai (gpt-4)</td>\n",
       "      <td>ai (gpt-4)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>The author mentions his shift of interest towa...</td>\n",
       "      <td>[I couldn't have put this into words when I wa...</td>\n",
       "      <td>The author shifted his interest towards Lisp a...</td>\n",
       "      <td>ai (gpt-4)</td>\n",
       "      <td>ai (gpt-4)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>In the essay, the author mentions his interest...</td>\n",
       "      <td>[So I looked around to see what I could salvag...</td>\n",
       "      <td>The author in the essay is Paul Graham, who wa...</td>\n",
       "      <td>ai (gpt-4)</td>\n",
       "      <td>ai (gpt-4)</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                               query  \\\n",
       "0  In the essay, the author mentions his early ex...   \n",
       "1  The author switched his major from philosophy ...   \n",
       "2  In the essay, the author discusses his initial...   \n",
       "3  The author mentions his shift of interest towa...   \n",
       "4  In the essay, the author mentions his interest...   \n",
       "\n",
       "                                  reference_contexts  \\\n",
       "0  [What I Worked On\\n\\nFebruary 2021\\n\\nBefore c...   \n",
       "1  [What I Worked On\\n\\nFebruary 2021\\n\\nBefore c...   \n",
       "2  [I couldn't have put this into words when I wa...   \n",
       "3  [I couldn't have put this into words when I wa...   \n",
       "4  [So I looked around to see what I could salvag...   \n",
       "\n",
       "                                    reference_answer reference_answer_by  \\\n",
       "0  The first computer the author used for program...          ai (gpt-4)   \n",
       "1  The two specific influences that led the autho...          ai (gpt-4)   \n",
       "2  The two main influences that initially drew th...          ai (gpt-4)   \n",
       "3  The author shifted his interest towards Lisp a...          ai (gpt-4)   \n",
       "4  The author in the essay is Paul Graham, who wa...          ai (gpt-4)   \n",
       "\n",
       "     query_by  \n",
       "0  ai (gpt-4)  \n",
       "1  ai (gpt-4)  \n",
       "2  ai (gpt-4)  \n",
       "3  ai (gpt-4)  \n",
       "4  ai (gpt-4)  "
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "rag_dataset.to_pandas()[:5]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3bb4507f-3489-42bc-a5ef-83cf41e8bcd0",
   "metadata": {},
   "source": [
    "With `documents`, you can build your own RAG pipeline, to then predict and perform evaluations to compare against the benchmarks listed in the `DatasetCard` associated with the datasets [llamahub.ai](https://llamahub.ai/)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0fdf85fb-69c1-41a8-b445-16a244c7b4be",
   "metadata": {},
   "source": [
    "### Predictions\n",
    "\n",
    "**NOTE**: The rest of the notebook illustrates how to manually perform predictions and subsequent evaluations for demonstrative purposes. Alternatively you can use the `RagEvaluatorPack` that will take care of predicting and evaluating using a RAG system that you would have provided."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "95fac3c9-8220-43e6-84b7-b8401a16a38f",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index import VectorStoreIndex\n",
    "\n",
    "# a basic RAG pipeline, uses service context defaults\n",
    "index = VectorStoreIndex.from_documents(documents=documents)\n",
    "query_engine = index.as_query_engine()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c402b3f9-add2-4df1-87c8-445224ad0cd7",
   "metadata": {},
   "source": [
    "You can now create predictions and perform evaluation manually or download the `PredictAndEvaluatePack` to do this for you in a single line of code."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5c4ac65d-331f-475b-ab15-2990f3d2af6d",
   "metadata": {},
   "outputs": [],
   "source": [
    "import nest_asyncio\n",
    "\n",
    "nest_asyncio.apply()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "40eee420-f5fd-439c-9b2e-fc287372ea53",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|███████████████████████████████████████████████████████| 44/44 [00:08<00:00,  4.90it/s]\n"
     ]
    }
   ],
   "source": [
    "# manually\n",
    "prediction_dataset = await rag_dataset.amake_predictions_with(\n",
    "    query_engine=query_engine, show_progress=True\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "84cf298a-cb5e-46d1-b5ae-e07fa5a13ae4",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>response</th>\n",
       "      <th>contexts</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>The author mentions that the first computer he...</td>\n",
       "      <td>[What I Worked On\\n\\nFebruary 2021\\n\\nBefore c...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>The author switched his major from philosophy ...</td>\n",
       "      <td>[I couldn't have put this into words when I wa...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>The author mentions two main influences that i...</td>\n",
       "      <td>[I couldn't have put this into words when I wa...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>The author mentions that he shifted his intere...</td>\n",
       "      <td>[So I looked around to see what I could salvag...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>The author mentions his interest in both compu...</td>\n",
       "      <td>[What I Worked On\\n\\nFebruary 2021\\n\\nBefore c...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                            response  \\\n",
       "0  The author mentions that the first computer he...   \n",
       "1  The author switched his major from philosophy ...   \n",
       "2  The author mentions two main influences that i...   \n",
       "3  The author mentions that he shifted his intere...   \n",
       "4  The author mentions his interest in both compu...   \n",
       "\n",
       "                                            contexts  \n",
       "0  [What I Worked On\\n\\nFebruary 2021\\n\\nBefore c...  \n",
       "1  [I couldn't have put this into words when I wa...  \n",
       "2  [I couldn't have put this into words when I wa...  \n",
       "3  [So I looked around to see what I could salvag...  \n",
       "4  [What I Worked On\\n\\nFebruary 2021\\n\\nBefore c...  "
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "prediction_dataset.to_pandas()[:5]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d19d5d9a-d973-4964-86a4-2f57aac96482",
   "metadata": {},
   "source": [
    "### Evaluation\n",
    "\n",
    "Now that we have our predictions, we can perform evaluations on two dimensions:\n",
    "\n",
    "1. The generated response: how well the predicted response matches the reference answer.\n",
    "2. The retrieved contexts: how well the retrieved contexts for the prediction match the reference contexts.\n",
    "\n",
    "NOTE: For retrieved contexts, we are unable to use standard retrieval metrics such as `hit rate` and `mean reciproccal rank` due to the fact that doing so requires we have the same index that was used to generate the ground truth data. But, it is not necessary for a `LabelledRagDataset` to be even created by an index. As such, we will use `semantic similarity` between the prediction's contexts and the reference contexts as a measure of goodness."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c8643a3a-b726-4dd2-804e-91998f7bc8c3",
   "metadata": {},
   "outputs": [],
   "source": [
    "import tqdm"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f76949c6-a7ca-4bf6-ae09-89535adec80e",
   "metadata": {},
   "source": [
    "For evaluating the response, we will use the LLM-As-A-Judge pattern. Specifically, we will use `CorrectnessEvaluator`, `FaithfulnessEvaluator` and `RelevancyEvaluator`.\n",
    "\n",
    "For evaluating the goodness of the retrieved contexts we will use `SemanticSimilarityEvaluator`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "03f665dc-ceea-4138-8ef1-b63b8179cef4",
   "metadata": {},
   "outputs": [],
   "source": [
    "# instantiate the gpt-4 judge\n",
    "from llama_index.llms import OpenAI\n",
    "from llama_index import ServiceContext\n",
    "from llama_index.evaluation import (\n",
    "    CorrectnessEvaluator,\n",
    "    FaithfulnessEvaluator,\n",
    "    RelevancyEvaluator,\n",
    "    SemanticSimilarityEvaluator,\n",
    ")\n",
    "\n",
    "judges = {}\n",
    "\n",
    "judges[\"correctness\"] = CorrectnessEvaluator(\n",
    "    service_context=ServiceContext.from_defaults(\n",
    "        llm=OpenAI(temperature=0, model=\"gpt-4\"),\n",
    "    )\n",
    ")\n",
    "\n",
    "judges[\"relevancy\"] = RelevancyEvaluator(\n",
    "    service_context=ServiceContext.from_defaults(\n",
    "        llm=OpenAI(temperature=0, model=\"gpt-4\"),\n",
    "    )\n",
    ")\n",
    "\n",
    "judges[\"faithfulness\"] = FaithfulnessEvaluator(\n",
    "    service_context=ServiceContext.from_defaults(\n",
    "        llm=OpenAI(temperature=0, model=\"gpt-4\"),\n",
    "    )\n",
    ")\n",
    "\n",
    "judges[\"semantic_similarity\"] = SemanticSimilarityEvaluator(\n",
    "    service_context=ServiceContext.from_defaults()\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2feb3042-6b06-409b-86ea-581500d35ff5",
   "metadata": {},
   "source": [
    "Loop through the (`labelled_example`, `prediction`) pais and perform the evaluations on each of them individually."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8edf32c2-efe2-4ccb-b9b9-a441a1e317d2",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "44it [07:15,  9.90s/it]\n"
     ]
    }
   ],
   "source": [
    "evals = {\n",
    "    \"correctness\": [],\n",
    "    \"relevancy\": [],\n",
    "    \"faithfulness\": [],\n",
    "    \"context_similarity\": [],\n",
    "}\n",
    "\n",
    "for example, prediction in tqdm.tqdm(\n",
    "    zip(rag_dataset.examples, prediction_dataset.predictions)\n",
    "):\n",
    "    correctness_result = judges[\"correctness\"].evaluate(\n",
    "        query=example.query,\n",
    "        response=prediction.response,\n",
    "        reference=example.reference_answer,\n",
    "    )\n",
    "\n",
    "    relevancy_result = judges[\"relevancy\"].evaluate(\n",
    "        query=example.query,\n",
    "        response=prediction.response,\n",
    "        contexts=prediction.contexts,\n",
    "    )\n",
    "\n",
    "    faithfulness_result = judges[\"faithfulness\"].evaluate(\n",
    "        query=example.query,\n",
    "        response=prediction.response,\n",
    "        contexts=prediction.contexts,\n",
    "    )\n",
    "\n",
    "    semantic_similarity_result = judges[\"semantic_similarity\"].evaluate(\n",
    "        query=example.query,\n",
    "        response=\"\\n\".join(prediction.contexts),\n",
    "        reference=\"\\n\".join(example.reference_contexts),\n",
    "    )\n",
    "\n",
    "    evals[\"correctness\"].append(correctness_result)\n",
    "    evals[\"relevancy\"].append(relevancy_result)\n",
    "    evals[\"faithfulness\"].append(faithfulness_result)\n",
    "    evals[\"context_similarity\"].append(semantic_similarity_result)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8396ee85-f1f8-43bd-90ee-8ee87de3f21f",
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "\n",
    "# saving evaluations\n",
    "evaluations_objects = {\n",
    "    \"context_similarity\": [e.dict() for e in evals[\"context_similarity\"]],\n",
    "    \"correctness\": [e.dict() for e in evals[\"correctness\"]],\n",
    "    \"faithfulness\": [e.dict() for e in evals[\"faithfulness\"]],\n",
    "    \"relevancy\": [e.dict() for e in evals[\"relevancy\"]],\n",
    "}\n",
    "\n",
    "with open(\"evaluations.json\", \"w\") as json_file:\n",
    "    json.dump(evaluations_objects, json_file)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e6899a20-10d0-4c1a-9403-b5ae811dcde6",
   "metadata": {},
   "source": [
    "Now, we can use our notebook utility functions to view these evaluations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ca9c62bb-d2c7-4f5d-a491-8dd25614baf6",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "from llama_index.evaluation.notebook_utils import (\n",
    "    get_eval_results_df,\n",
    ")\n",
    "\n",
    "deep_eval_df, mean_correctness_df = get_eval_results_df(\n",
    "    [\"base_rag\"] * len(evals[\"correctness\"]),\n",
    "    evals[\"correctness\"],\n",
    "    metric=\"correctness\",\n",
    ")\n",
    "deep_eval_df, mean_relevancy_df = get_eval_results_df(\n",
    "    [\"base_rag\"] * len(evals[\"relevancy\"]),\n",
    "    evals[\"relevancy\"],\n",
    "    metric=\"relevancy\",\n",
    ")\n",
    "_, mean_faithfulness_df = get_eval_results_df(\n",
    "    [\"base_rag\"] * len(evals[\"faithfulness\"]),\n",
    "    evals[\"faithfulness\"],\n",
    "    metric=\"faithfulness\",\n",
    ")\n",
    "_, mean_context_similarity_df = get_eval_results_df(\n",
    "    [\"base_rag\"] * len(evals[\"context_similarity\"]),\n",
    "    evals[\"context_similarity\"],\n",
    "    metric=\"context_similarity\",\n",
    ")\n",
    "\n",
    "mean_scores_df = pd.concat(\n",
    "    [\n",
    "        mean_correctness_df.reset_index(),\n",
    "        mean_relevancy_df.reset_index(),\n",
    "        mean_faithfulness_df.reset_index(),\n",
    "        mean_context_similarity_df.reset_index(),\n",
    "    ],\n",
    "    axis=0,\n",
    "    ignore_index=True,\n",
    ")\n",
    "mean_scores_df = mean_scores_df.set_index(\"index\")\n",
    "mean_scores_df.index = mean_scores_df.index.set_names([\"metrics\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ae2bf5d8-dafa-4674-86e8-2c04ba721a92",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th>rag</th>\n",
       "      <th>base_rag</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>metrics</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>mean_correctness_score</th>\n",
       "      <td>4.238636</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mean_relevancy_score</th>\n",
       "      <td>0.977273</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mean_faithfulness_score</th>\n",
       "      <td>0.977273</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mean_context_similarity_score</th>\n",
       "      <td>0.933568</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "rag                            base_rag\n",
       "metrics                                \n",
       "mean_correctness_score         4.238636\n",
       "mean_relevancy_score           0.977273\n",
       "mean_faithfulness_score        0.977273\n",
       "mean_context_similarity_score  0.933568"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mean_scores_df"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "53171536-7562-4cec-ad79-fa091939b0a1",
   "metadata": {},
   "source": [
    "On this toy example, we see that the basic RAG pipeline performs quite well against the evaluation benchmark (`rag_dataset`)! For completeness, to perform the above steps instead by using the `RagEvaluatorPack`, use the code provided below:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1613635-82e3-4359-bc39-fe63acf745a9",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.llama_pack import download_llama_pack\n",
    "\n",
    "RagEvaluatorPack = download_llama_pack(\"RagEvaluatorPack\", \"./pack\")\n",
    "rag_evaluator = RagEvaluatorPack(\n",
    "    query_engine=query_engine, rag_dataset=rag_dataset, show_progress=True\n",
    ")\n",
    "\n",
    "############################################################################\n",
    "# NOTE: If have a lower tier subscription for OpenAI API like Usage Tier 1 #\n",
    "# then you'll need to use different batch_size and sleep_time_in_seconds.  #\n",
    "# For Usage Tier 1, settings that seemed to work well were batch_size=5,   #\n",
    "# and sleep_time_in_seconds=15 (as of December 2023.)                      #\n",
    "############################################################################\n",
    "\n",
    "benchmark_df = await rag_evaluator_pack.arun(\n",
    "    batch_size=20,  # batches the number of openai api calls to make\n",
    "    sleep_time_in_seconds=1,  # seconds to sleep before making an api call\n",
    ")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "llama_index_3.10",
   "language": "python",
   "name": "llama_index_3.10"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
